Conditional Independence Trees

Authors

  • Harry Zhang
  • Jiang Su
Abstract

It has been observed that traditional decision trees produce poor probability estimates. In many applications, however, a probability estimation tree (PET) with accurate probability estimates is desirable. Some researchers ascribe the poor probability estimates of decision trees to the decision tree learning algorithms. In our observation, however, the representation also plays an important role. Indeed, the representation of decision trees is, in theory, fully expressive, but it is often impractical to learn such a representation with accurate probability estimates from limited training data. In this paper, we extend decision trees to represent a joint distribution and conditional independence. The resulting model, called the conditional independence tree (CITree), is a more suitable model for PETs. We propose a novel algorithm for learning CITrees, and our experiments show that the CITree algorithm significantly outperforms C4.5 and naive Bayes in classification accuracy.
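The representational idea in the abstract can be illustrated with a small sketch: a tree whose leaves hold naive Bayes models over the attributes not tested on the path, so that conditional independence is only assumed within each leaf's context. This is not the CITree learning algorithm from the paper. The split attribute is fixed by hand, the data set is a toy XOR problem, and the names NaiveBayesLeaf and OneSplitTree are hypothetical, chosen only for this illustration.

```python
from collections import Counter


class NaiveBayesLeaf:
    """Naive Bayes over categorical attributes, with Laplace smoothing."""

    def __init__(self, rows, labels, attrs, classes, values):
        self.attrs, self.classes, self.values = attrs, classes, values
        self.n = len(labels)
        self.class_counts = Counter(labels)
        # counts[(attribute, value, class)] = number of matching rows
        self.counts = Counter()
        for row, y in zip(rows, labels):
            for a in attrs:
                self.counts[(a, row[a], y)] += 1

    def predict_proba(self, row):
        scores = {}
        for c in self.classes:
            # Smoothed class prior times smoothed per-attribute likelihoods
            p = (self.class_counts[c] + 1) / (self.n + len(self.classes))
            for a in self.attrs:
                num = self.counts[(a, row[a], c)] + 1
                den = self.class_counts[c] + len(self.values[a])
                p *= num / den
            scores[c] = p
        total = sum(scores.values())
        return {c: s / total for c, s in scores.items()}


class OneSplitTree:
    """One hand-picked split; each branch ends in a naive Bayes leaf
    over the remaining attributes (an illustrative assumption)."""

    def __init__(self, rows, labels, split_attr, values):
        classes = sorted(set(labels))
        rest = [a for a in values if a != split_attr]
        self.split_attr = split_attr
        self.leaves = {}
        for v in values[split_attr]:
            idx = [i for i, r in enumerate(rows) if r[split_attr] == v]
            self.leaves[v] = NaiveBayesLeaf(
                [rows[i] for i in idx], [labels[i] for i in idx],
                rest, classes, values)

    def predict_proba(self, row):
        return self.leaves[row[self.split_attr]].predict_proba(row)


# Toy data: class = a XOR b, each combination repeated five times.
values = {"a": [0, 1], "b": [0, 1]}
rows = [{"a": a, "b": b} for a in (0, 1) for b in (0, 1)] * 5
labels = [r["a"] ^ r["b"] for r in rows]

flat = NaiveBayesLeaf(rows, labels, ["a", "b"], [0, 1], values)
tree = OneSplitTree(rows, labels, "a", values)

print(flat.predict_proba({"a": 0, "b": 0}))
print(tree.predict_proba({"a": 0, "b": 0}))
```

On this toy data the flat naive Bayes model cannot separate the classes and returns 0.5 for each, while the one-split tree assigns a probability of about 0.86 to the correct class, because within each branch the remaining attribute is informative given the class. The example only shows why the choice of representation can matter for probability estimates; how to choose splits is left open here.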

Similar resources

An Efficient Independence Sampler for Updating Branches in Bayesian Markov chain Monte Carlo Sampling of Phylogenetic Trees

Sampling tree space is the most challenging aspect of Bayesian phylogenetic inference. The sheer number of alternative topologies is problematic by itself. In addition, the complex dependency between branch lengths and topology increases the difficulty of moving efficiently among topologies. Current tree proposals are fast but sample new trees using primitive transformations or re-mappings of o...

Mixed Roman domination and 2-independence in trees

Let $G=(V, E)$ be a simple graph with vertex set $V$ and edge set $E$. A mixed Roman dominating function (MRDF) of $G$ is a function $f: V\cup E\rightarrow \{0,1,2\}$ satisfying the condition that every element $x\in V\cup E$ for which $f(x)=0$ is adjacent or incident to at least one element $y\in V\cup E$ for which $f(y)=2$. The weight of an MRDF $f$ is $\sum_{x\in V\cup E} f(x)$. The mi...

The Relevance of Trees

In my contribution to this workshop, I would like to discuss concepts of relevance that emerge from thinking of trees as basic to probability. Some of these concepts are developed in a monograph entitled The Art of Causal Conjecture, which I am finishing this summer and which will be published by MIT Press in their AI series, but I hope to go beyond the monograph in dealing with the concept of relev...

Int Reduction

Naive-Bayes induction algorithms were previously shown to be surprisingly accurate on many classification tasks even when the conditional independence assumption on which they are based is violated. However, most studies were done on small databases. We show that in some larger databases, the accuracy of Naive-Bayes does not scale up as well as decision trees. We then propose a new algorithm, N...

Propositional Independence Conditional independence

Independence – the study of what is relevant to a given problem of reasoning – is an important AI topic. In this paper, we investigate several notions of conditional independence in propositional logic: Darwiche and Pearl’s conditional independence, and two more restricted forms of it, called strong conditional independence and perfect conditional independence. Many characterizations and proper...

Publication date: 2004